
S3 Vectors

Summary

This document covers the information to gather from S3 Vectors in order to configure a Qarbine data service. The data service will use the Qarbine AWS_S3_Vectors driver. You can define multiple data services that access the same S3 Vectors bucket, each with different credentials. Once a data service is defined, you can manage which Qarbine principals have access to it and its associated objects. A Qarbine administrator has visibility into all data services. More information on S3 Vectors can be found at https://aws.amazon.com/s3/features/vectors/ and https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors.html.

S3 Vectors Configuration

Overview

Qarbine uses its AWS S3 Vectors driver to interact with S3 Vectors buckets and objects. Gather the following parameters to configure that interaction:

  • Region,
  • Access key ID,
  • Secret access key, and
  • Specific bucket of interest (optional).

The region is shown in the upper right of the AWS console.

  

To define access credentials:

  1. Go to the AWS Management Console.
  2. Navigate to the IAM (Identity and Access Management) service.
  3. Create a new user or use an existing one.
  4. Attach a policy that grants access to your S3 Vectors buckets.
  5. After creating the user, download or copy the Access Key ID and Secret Access Key into a temporary location.

Also review the other security settings in your bucket.

Details on IAM policies related to S3 Vectors can be found at
https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-iam-policies.html and https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-access-management.html.

The table below lists many of the S3 Vectors actions of interest and how Qarbine uses them.

Action                   Qarbine Usage
GetIndex                 Required
GetVectorBucket          Required
GetVectorBucketPolicy    Required
GetVectors               Required
QueryVectors             Required
ListIndexes              Required
ListVectorBuckets        Required
ListVectors              Required
Create*                  Not accessible
Delete*                  Not accessible
Put*                     Not accessible
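
As an illustration, the required actions above can be collected into a read-only IAM policy document. The following is a minimal sketch, not an official Qarbine or AWS policy. The `s3vectors:` action prefix, the wildcard resource, and the statement ID are assumptions of this example; check the IAM policy pages linked above and narrow the resource to your own bucket before using it.

```python
import json

# Read-only S3 Vectors actions taken from the table above.
# The "s3vectors:" prefix and the "*" resource are assumptions of this sketch.
READ_ONLY_ACTIONS = [
    "GetIndex", "GetVectorBucket", "GetVectorBucketPolicy", "GetVectors",
    "QueryVectors", "ListIndexes", "ListVectorBuckets", "ListVectors",
]


def build_read_only_policy(resource="*"):
    """Return an IAM policy document (as a dict) allowing only the read actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "QarbineS3VectorsReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": [f"s3vectors:{name}" for name in READ_ONLY_ACTIONS],
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the IAM policy editor.
    print(json.dumps(build_read_only_policy(), indent=2))
```

No Create, Delete, or Put actions are included, which matches the "Not accessible" rows in the table.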

You can configure Qarbine to present either a single bucket in the tools or all buckets accessible to the associated IAM sign on. A listing of your S3 Vectors buckets can be seen on the AWS Console by navigating to the highlighted option shown below.

  

Qarbine Configuration

Compute Node Preparation

Determine which compute node service endpoint you want to run this data access from. That URL will go into the Data Service’s Compute URL field. Its form is “https://domain:port/dispatch”. A sample is shown below.
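
The URL form described above can be checked before it is pasted into the Compute URL field. This is only a sketch using the Python standard library. The host name and port in the usage note are hypothetical, and Qarbine itself may accept or reject URLs differently.

```python
from urllib.parse import urlparse


def is_compute_url(url):
    """Return True if url matches the https://domain:port/dispatch form."""
    parts = urlparse(url)
    try:
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return False
    return (
        parts.scheme == "https"
        and bool(parts.hostname)
        and port is not None
        and parts.path == "/dispatch"
    )
```

For example, `is_compute_url("https://qarbine.example.com:4001/dispatch")` evaluates to `True`, while a URL with no port or with a different path evaluates to `False`.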

  

The port number corresponds to a named service endpoint configured on the given target host. For example, the primary compute node is usually set to have a ‘main’ service. That service’s configuration is defined in the ˜./qarbine.service/config/service.main.json file. Inside that file the following driver entry is required:

"drivers" :[
. . .
"./driver/awsS3VectorsDriver.js"
]

The relevant configuration file name for non-primary Qarbine compute nodes (those other than main) is service.NAME.json. Make sure the JSON syntax is well formed, or a startup error is likely to occur. If you add that entry, restart the service using the general command line syntax

pm2 restart <service>

For example,

pm2 restart main

or simply

pm2 restart all
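
Before restarting, it can help to confirm that the edited configuration is still well-formed JSON and that the driver entry is present. The sketch below assumes that "drivers" is a top-level key of the service configuration file, which is not confirmed by this document; adjust the lookup if your file nests it differently.

```python
import json

# Driver entry named in the configuration step above.
DRIVER_ENTRY = "./driver/awsS3VectorsDriver.js"


def has_s3_vectors_driver(config_text):
    """Parse service configuration text and report whether the driver is listed.

    Raises json.JSONDecodeError if the text is not well-formed JSON.
    Assumes "drivers" is a top-level key (an assumption of this sketch).
    """
    config = json.loads(config_text)
    if not isinstance(config, dict):
        return False
    return DRIVER_ENTRY in config.get("drivers", [])
```

To use it, read the contents of your service.main.json (or service.NAME.json) file into a string and pass that string to `has_s3_vectors_driver`. A `json.JSONDecodeError` indicates a syntax problem that would likely also cause the startup error mentioned above.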

Data Service Definition

Open the Administration Tool.

Navigate to the Data Services tab.

  

A data service defines the compute node on which a query runs by default, along with the means to reach the target data. The latter includes which native driver to use along with settings corresponding to that driver. Multiple Data Sources can reference a single Data Service. The details of any one Data Service are thus maintained in one place rather than being repeated in each Data Source, which would be a maintenance and support nightmare.

To begin adding a data service click

  

On the right-hand side, enter a name and, optionally, a description.


  

Set the Compute URL field based on the identified compute node above. Its form is “https://domain:port/dispatch”. A sample is shown below.

  

Set the driver

  

You can leave the server template empty to use the default S3 Vectors endpoint based on your region setting in the server options.

  

Information on S3 Vectors endpoints can be found at
https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-regions-quotas.html

Set the server options such as

region=us-east-1,
accessKeyId = "1234567890",
secretAccessKey="abcdefghijklmnop12345678901234567890"

The last two values come from the IAM credentials you obtained in the S3 Vectors Configuration section. For example,

  

You can limit the Qarbine tools to a single bucket by setting the generic “Database” field to that bucket.

  

The associated IAM policy should remain the primary mechanism for enforcing bucket visibility and interaction.
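
Independently of Qarbine, the region and key pair can be checked against AWS to see which vector buckets the IAM policy actually exposes. The following is a rough sketch, not part of the Qarbine driver. It assumes a boto3 release that includes the `s3vectors` client, and it assumes the `list_vector_buckets` response carries `vectorBuckets`, `vectorBucketName`, and `nextToken` fields; verify both against the current boto3 documentation.

```python
def make_client(region, access_key_id, secret_access_key):
    """Create an S3 Vectors client (assumes boto3 provides the "s3vectors" service)."""
    import boto3  # imported lazily so list_bucket_names can be used with a stub client

    return boto3.client(
        "s3vectors",
        region_name=region,
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )


def list_bucket_names(client):
    """Return the names of all vector buckets visible to the client's credentials."""
    names = []
    kwargs = {}
    while True:
        page = client.list_vector_buckets(**kwargs)
        names.extend(b["vectorBucketName"] for b in page.get("vectorBuckets", []))
        token = page.get("nextToken")
        if not token:
            return names
        # Request the next page of results.
        kwargs = {"nextToken": token}
```

Calling `list_bucket_names(make_client(region, access_key_id, secret_access_key))` with the values from the server options should return the buckets the IAM user can list. If a bucket you expect is missing, review the attached policy rather than the “Database” field.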

Test your settings by clicking on the toolbar image highlighted below.

  

The result should be similar to the following.

  

Save the Data Service by clicking on the image highlighted below.

  

The data service becomes available the next time you log on. Next, see the AWS S3 Vectors query interaction documentation and any tutorial for information on interacting with S3 Vectors from Qarbine.

Sample Bucket Specific Data Service

Here are the primary settings for this scenario.

  

  

The verification dialog is shown below.

  

The effect in the Data Source Designer is shown below.

  

Without the “Database” (the S3 Vectors bucket) setting, the Data Source Designer drop-down would list all vector buckets accessible to the associated IAM sign on.